<html>
<head><meta charset="utf-8"><title>Can LLVM truncate a constant? · t-lang/wg-unsafe-code-guidelines · Zulip Chat Archive</title></head>
<h2>Stream: <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/index.html">t-lang/wg-unsafe-code-guidelines</a></h2>
<h3>Topic: <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html">Can LLVM truncate a constant?</a></h3>

<hr>

<base href="https://rust-lang.zulipchat.com">

<head><link href="https://rust-lang.github.io/zulip_archive/style.css" rel="stylesheet"></head>

<a name="173563641"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/173563641" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Lokathor <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#173563641">(Aug 19 2019 at 17:46)</a>:</h4>
<p>Suppose that I have a constant for a null terminated utf16 string like this</p>
<div class="codehilite"><pre><span></span><span class="k">const</span><span class="w"> </span><span class="n">HELLO_UTF16_NULL</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="kt">u16</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="mi">104</span><span class="p">,</span><span class="w"> </span><span class="mi">101</span><span class="p">,</span><span class="w"> </span><span class="mi">108</span><span class="p">,</span><span class="w"> </span><span class="mi">108</span><span class="p">,</span><span class="w"> </span><span class="mi">111</span><span class="p">,</span><span class="w"> </span><span class="mi">0</span><span class="p">];</span><span class="w"></span>
</pre></div>


<p>And then in the actual program I only ever access the 0th index of that constant with something like <code>&amp;HELLO_UTF16_NULL[0]</code> to get <code>&amp;u16</code> so that I have "a pointer to the start of the string" for sending over FFI.</p>
<p>Is LLVM ever permitted to truncate my constant? Will it remove all the data past the first index because I'm only using the first index, and then screw up my pointer that's supposed to point to the start of a complete null terminated string?</p>
<p>(Unlike many things we chat about here, this is a question that's less about the Rust Abstract Machine and more about what LLVM itself really guarantees and does.)</p>



<a name="173563882"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/173563882" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Hanna Kruppe <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#173563882">(Aug 19 2019 at 17:49)</a>:</h4>
<p>This does somewhat relate to the abstract machine, specifically pointer provenance. If taking a pointer to element zero specifically only grants permission to access that element, not the rest of the allocation (which IIRC is what stacked borrows currently does) then accessing the rest of the array from that pointer is UB even if LLVM doesn't elide the rest of the array. (And conversely, of course, if we did allow it we'd have to make sure LLVM doesn't miscompile that!). Better use <code>.as_ptr()</code> on the slice to be sure! It's more readable anyway IMO.</p>



<a name="173564388"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/173564388" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Lokathor <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#173564388">(Aug 19 2019 at 17:55)</a>:</h4>
<p><code>as_ptr</code> doesn't seem to be magic, in fact <a href="https://doc.rust-lang.org/src/core/slice/mod.rs.html#372-374" target="_blank" title="https://doc.rust-lang.org/src/core/slice/mod.rs.html#372-374">its definition</a> is about as dull as you can image, just some casting.</p>
<p>If I understand the <code>&amp;HELLO_UTF16_NULL[0]</code> indexing process correctly, it's also grabbing <code>&amp;self</code> on the slice, passing that to <code>&lt;[T] as Index&lt;usize&gt;&gt;::index</code> or however you name that method, which then offsets from the base address by 0 slots and gives that reference back.</p>
<p>So it seems like either both ways should be equally good, or both ways are in equal danger.</p>



<a name="173564610"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/173564610" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Lokathor <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#173564610">(Aug 19 2019 at 17:58)</a>:</h4>
<p>(the full context of the question is for use in a proc-macro, so the end user would write something like</p>
<div class="codehilite"><pre><span></span><span class="kd">let</span><span class="w"> </span><span class="n">window_title</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">L</span><span class="o">!</span><span class="p">(</span><span class="s">&quot;hello&quot;</span><span class="p">);</span><span class="w"></span>
</pre></div>


<p>and then <code>L!</code> would expand to something like</p>
<div class="codehilite"><pre><span></span><span class="p">{</span><span class="w"></span>
<span class="w">  </span><span class="k">const</span><span class="w"> </span><span class="n">LITERAL</span>: <span class="kp">&amp;</span><span class="p">[</span><span class="kt">u16</span><span class="p">]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">[</span><span class="n">the</span><span class="w"> </span><span class="n">string</span><span class="w"> </span><span class="n">literal</span><span class="w"> </span><span class="kr">proc</span><span class="w"> </span><span class="n">expanded</span><span class="w"> </span><span class="n">to</span><span class="w"> </span><span class="kt">u16</span><span class="w"> </span><span class="n">data</span><span class="w"> </span><span class="n">with</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="n">on</span><span class="w"> </span><span class="n">the</span><span class="w"> </span><span class="n">end</span><span class="w"> </span><span class="n">goes</span><span class="w"> </span><span class="n">here</span><span class="p">];</span><span class="w"></span>
<span class="w">  </span><span class="o">&amp;</span><span class="n">LITERAL</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="w"></span>
<span class="p">}</span><span class="w"></span>
</pre></div>


<p>So you get back <code>&amp;'static u16</code>, which will coerce to <code>*const u16</code> as needed)</p>



<a name="173564629"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/173564629" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Hanna Kruppe <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#173564629">(Aug 19 2019 at 17:58)</a>:</h4>
<p>I'm not well versed enough in stacked borrows to explain why they differ but see <a href="https://github.com/rust-lang/unsafe-code-guidelines/issues/134" target="_blank" title="https://github.com/rust-lang/unsafe-code-guidelines/issues/134">https://github.com/rust-lang/unsafe-code-guidelines/issues/134</a> for a citation.</p>



<a name="173564937"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/173564937" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Lokathor <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#173564937">(Aug 19 2019 at 18:02)</a>:</h4>
<p>Hmm, okay, I'm sufficiently convinced of <code>as_ptr</code> over <code>&amp;slice[0]</code>.</p>
<p>but, still, does that make LLVM keep the entire const allocation instead of truncating it? Does LLVM "know" that it's a pointer to the start of an array and it has to keep the whole array?</p>



<a name="173565071"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/173565071" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Hanna Kruppe <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#173565071">(Aug 19 2019 at 18:04)</a>:</h4>
<p>It has to if we lower that code correctly, i.e., if the pointer really can be used to access the whole allocation in LLVM's memory model (which should be trivial to achieve)</p>



<a name="173565130"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/173565130" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Lokathor <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#173565130">(Aug 19 2019 at 18:05)</a>:</h4>
<p>should <em>not</em> be?</p>



<a name="173565144"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/173565144" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Hanna Kruppe <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#173565144">(Aug 19 2019 at 18:05)</a>:</h4>
<p>uh, editing mishap, fixed</p>



<a name="173565162"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/173565162" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Lokathor <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#173565162">(Aug 19 2019 at 18:05)</a>:</h4>
<p>ah ha! Okay then.</p>



<a name="174034955"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/174034955" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> RalfJ <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#174034955">(Aug 24 2019 at 11:13)</a>:</h4>
<blockquote>
<p>as_ptr doesn't seem to be magic, in fact its definition is about as dull as you can image, just some casting.</p>
</blockquote>
<p>The key point is that it casts the entire slice to a raw pointer, and then casts the raw pointer to an element. That's what you want. This is very different from getting a reference to the first element (only permitted to access that one element), and then cast that to a raw pointer.<br>
So yes, it's dull, but it's dull in the right way. <code>&amp;HELLO_UTF16_NULL[0] as *const _</code> as <code>as_ptr()</code> are <em>not</em> equivalent.</p>



<a name="174034962"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can%20LLVM%20truncate%20a%20constant%3F/near/174034962" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> RalfJ <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/Can.20LLVM.20truncate.20a.20constant.3F.html#174034962">(Aug 24 2019 at 11:13)</a>:</h4>
<p>ah seems you cleared that up, good :)</p>



<hr><p>Last updated: Aug 07 2021 at 22:04 UTC</p>
</html>